AITopics | East China Sea

Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

Neural Information Processing Systems | Jun-15-2026, 05:09:42 GMT

Scientific discoveries increasingly rely on complex multimodal reasoning that integrates information-intensive scientific data and domain-specific expertise. Empowered by expert-level scientific benchmarks, scientific Multimodal Large Language Models (MLLMs) hold the potential to significantly enhance this discovery process in realistic workflows. However, current scientific benchmarks mostly focus on evaluating the knowledge understanding capabilities of MLLMs, leading to an inadequate assessment of their perception and reasoning abilities. To address this gap, we present the Scientists' First Exam (SFE) benchmark, designed to evaluate the scientific cognitive capacities of MLLMs through three cognitive levels: scientific signal perception, scientific attribute understanding, and scientific comparative reasoning. Specifically, SFE comprises 830 expert-verified VQA pairs across three question types, spanning 66 multimodal tasks across five high-value disciplines. Extensive experiments reveal that current state-of-the-art GPT-o3 and InternVL-3 achieve only 34.08% and 26.52% on SFE, highlighting significant room for MLLMs to improve in scientific realms. We hope the insights obtained in SFE will facilitate further developments in AI-enhanced scientific discoveries.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > China (0.69)
Pacific Ocean > North Pacific Ocean > East China Sea (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.40)
Health & Medicine > Therapeutic Area > Neurology (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(3 more...)

Chinese fishing 'militia' formations signal rising gray-zone pressure on Taiwan

FOX News | Mar-15-2026, 10:00:11 GMT

China's People's Armed Forces Maritime Militia deployed thousands of fishing vessels in coordinated formations that could disrupt global shipping lanes, analysts warn.

artificial intelligence, social media, taiwan, (9 more...)

FOX News

Country:

Asia > Middle East > Iran (0.34)
Asia > Middle East > UAE (0.15)
Asia > North Korea (0.14)
(23 more...)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Transportation > Freight & Logistics Services > Shipping (0.88)
Government > Military > Navy (0.70)

Technology:

Information Technology > Communications > Social Media (0.73)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.69)

11704817e347269b7254e744b5e22dac-Paper.pdf

Neural Information Processing Systems | Feb-18-2026, 22:32:13 GMT

For example, a real-time communications service may be interested in tuning the parameters of a control policy to adapt video quality in real time in order to maximize video quality and minimize latency [10, 17].

artificial intelligence, machine learning, optimization, (17 more...)

Neural Information Processing Systems

Country:

Pacific Ocean > North Pacific Ocean > East China Sea > Yellow Sea > Bohai Sea > Bohai Bay (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)
Asia > China > Bohai Bay (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.95)
Information Technology > Artificial Intelligence > Machine Learning (0.94)

China Dives in on the World's First Wind-Powered Undersea Data Center

WIRED | Oct-28-2025, 11:00:00 GMT

The $226 million project uses ocean breezes and seawater to stay cool. China is submerging data centers into the ocean to keep them cool. China has completed the first phase of construction of what it claims is the world's first underwater data center (UDC). Located in Shanghai's Lin-gang Special Area with a price tag of roughly RMB 1.6 billion ($226 million), it's a significant milestone in the quest for sustainable solutions to the growing energy demands of China's computing infrastructure. Powered entirely by wind energy, the initiative has a total power capacity of 24 megawatts.

china dive, wind-powered undersea data center, wired, (10 more...)

WIRED

Country:

Asia > China > Shanghai > Shanghai (0.30)
Pacific Ocean > North Pacific Ocean > East China Sea (0.05)
North America > United States > New York (0.05)
(4 more...)

Industry:

Information Technology > Services (0.87)
Energy > Renewable > Wind (0.70)

Technology:

Information Technology > Information Management (0.97)
Information Technology > Cloud Computing (0.97)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
(2 more...)

He'd need some LARGE SquarePants: Footage of a sea star with a 'big bottom' sparks hilarity as it's compared to SpongeBob's Patrick

Daily Mail - Science & tech | Aug-5-2025, 11:13:53 GMT

The sea floor is home to all sorts of weird and wonderful creatures. But one in particular has become an online sensation, thanks to its impressive 'buttocks'. A big-bottomed sea star has been spotted more than 1,000 metres (3,280ft) below the waves. And it appears to have a backside that will make even the most avid gymgoer jealous. This has led many baffled viewers to compare the creature to Patrick from the animated series SpongeBob SquarePants.

pink sea star, scientist, sea star, (15 more...)

Daily Mail - Science & tech

Country:

Pacific Ocean > North Pacific Ocean > East China Sea > Yellow Sea (0.07)
North America > United States > New York (0.06)
South America > Argentina (0.05)
(7 more...)

Technology: Information Technology > Artificial Intelligence > Robots (0.32)

A Dual-Layered Evaluation of Geopolitical and Cultural Bias in LLMs

Kim, Sean, Kim, Hyuhng Joon

arXiv.org Artificial Intelligence | Jun-30-2025

As large language models (LLMs) are increasingly deployed across diverse linguistic and cultural contexts, understanding their behavior in both factual and disputable scenarios is essential, especially when their outputs may shape public opinion or reinforce dominant narratives. In this paper, we define two types of bias in LLMs: model bias (bias stemming from model training) and inference bias (bias induced by the language of the query), through a two-phase evaluation. Phase 1 evaluates LLMs on factual questions where a single verifiable answer exists, assessing whether models maintain consistency across different query languages. Phase 2 expands the scope by probing geopolitically sensitive disputes, where responses may reflect culturally embedded or ideologically aligned perspectives. We construct a manually curated dataset spanning both factual and disputable QA, across four languages and question types. The results show that Phase 1 exhibits query language induced alignment, while Phase 2 reflects an interplay between the model's training context and query language. This paper offers a structured framework for evaluating LLM behavior across neutral and sensitive topics, providing insights for future LLM deployment and culturally aware evaluation practices in multilingual contexts.
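As a rough illustration of the Phase 1 idea (checking whether a model gives the same factual answer regardless of query language), the sketch below computes a simple agreement rate. The function name, the toy data, the language codes, and the exact-match comparison are all assumptions made for illustration; the abstract does not specify the paper's actual metric.

```python
def consistency_rate(answers_by_question):
    """Fraction of questions whose answers agree across all query languages.

    answers_by_question maps a question id to {language code: model answer}.
    Agreement here is a case-insensitive exact match (an assumption).
    """
    if not answers_by_question:
        return 0.0
    consistent = sum(
        1
        for answers in answers_by_question.values()
        if len({a.strip().lower() for a in answers.values()}) == 1
    )
    return consistent / len(answers_by_question)


# Hypothetical toy data: two questions, each asked in four languages.
toy_answers = {
    "q1": {"en": "Paris", "ko": "Paris", "zh": "paris", "ja": "Paris"},
    "q2": {"en": "A", "ko": "B", "zh": "A", "ja": "A"},
}
rate = consistency_rate(toy_answers)  # q1 agrees, q2 does not
```

A lower rate on such a check would suggest that the query language itself is shifting the model's answer, which is the effect the abstract calls inference bias.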

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.21881

Country:

Asia > Japan (0.16)
Asia > China (0.06)
Asia > East Asia (0.05)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

MPT: A Large-scale Multi-Phytoplankton Tracking Benchmark

Yu, Yang, Li, Yuezun, Sun, Xin, Dong, Junyu

arXiv.org Artificial Intelligence | Jan-4-2025

Phytoplankton are a crucial component of aquatic ecosystems, and effective monitoring of them can provide valuable insights into ocean environments and ecosystem changes. Traditional phytoplankton monitoring methods are often complex and lack timely analysis. Therefore, deep learning algorithms offer a promising approach for automated phytoplankton monitoring. However, the lack of large-scale, high-quality training samples has become a major bottleneck in advancing phytoplankton tracking. In this paper, we propose a challenging benchmark dataset, Multiple Phytoplankton Tracking (MPT), which covers diverse background information and variations in motion during observation. The dataset includes 27 species of phytoplankton and zooplankton, 14 different backgrounds to simulate diverse and complex underwater environments, and a total of 140 videos. To enable accurate real-time observation of phytoplankton, we introduce a multi-object tracking method, Deviation-Corrected Multi-Scale Feature Fusion Tracker (DSFT), which addresses issues such as focus shifts during tracking and the loss of small target information when computing frame-to-frame similarity. Specifically, we introduce an additional feature extractor to predict the residuals of the standard feature extractor's output, and compute multi-scale frame-to-frame similarity based on features from different layers of the extractor. Extensive experiments on the MPT have demonstrated the validity of the dataset and the superiority of DSFT in tracking phytoplankton, providing an effective solution for phytoplankton monitoring.
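The two ideas named in the abstract, residual correction of features and multi-scale frame-to-frame similarity, can be sketched in a few lines. This is a minimal illustration, not the DSFT implementation: the helper names are hypothetical, and the use of cosine similarity, additive residuals, and a plain average over scales are assumptions the abstract does not confirm.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def apply_residual(base_feature, predicted_residual):
    """Correct a base feature by adding a predicted residual (assumed additive)."""
    return [b + r for b, r in zip(base_feature, predicted_residual)]


def multi_scale_similarity(feats_prev, feats_curr):
    """Average per-scale cosine similarity between two detections.

    Each argument is a list of feature vectors, one per extractor layer.
    """
    sims = [cosine(fp, fc) for fp, fc in zip(feats_prev, feats_curr)]
    return sum(sims) / len(sims)


# Hypothetical two-scale features for one target in consecutive frames.
prev = [[1.0, 0.0], [1.0, 0.0]]
curr = [[1.0, 0.0], [0.0, 1.0]]
score = multi_scale_similarity(prev, curr)  # matches at one scale only
```

Combining similarities from several layers is one way small targets can still be matched when the deepest features have lost their detail.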

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.16695

Country:

Pacific Ocean > North Pacific Ocean > East China Sea > Yellow Sea (0.04)
Asia > Macao (0.04)
North America > United States > Virginia (0.04)
(4 more...)

Genre: Research Report (0.84)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Generating Unseen Nonlinear Evolution in Sea Surface Temperature Using a Deep Learning-Based Latent Space Data Assimilation Framework

Zheng, Qingyu, Han, Guijun, Li, Wei, Cao, Lige, Zhou, Gongfu, Wu, Haowen, Shao, Qi, Wang, Ru, Wu, Xiaobo, Cui, Xudong, Li, Hong, Wang, Xuan

arXiv.org Artificial Intelligence | Dec-17-2024

Advances in data assimilation (DA) methods have greatly improved the accuracy of Earth system predictions. To fuse multi-source data and reconstruct the nonlinear evolution missing from observations, geoscientists are developing future-oriented DA methods. In this paper, we redesign a purely data-driven latent space DA framework (DeepDA) that employs a generative artificial intelligence model to capture the nonlinear evolution in sea surface temperature. Under variational constraints, DeepDA embedded with nonlinear features can effectively fuse heterogeneous data. The results show that DeepDA remains highly stable in capturing and generating nonlinear evolutions even when a large amount of observational information is missing. It can be found that when only 10% of the observation information is available, the error increase of DeepDA does not exceed 40%. Furthermore, DeepDA has been shown to be robust in the fusion of real observations and ensemble simulations. In particular, this paper provides a mechanism analysis of the nonlinear evolution generated by DeepDA from the perspective of physical patterns, which reveals the inherent explainability of our DL model in capturing multi-scale ocean signals.

artificial intelligence, deepda, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.13477

Country:

North America > United States (0.28)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Fujian Province > Fuzhou (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

CharacterBench: Benchmarking Character Customization of Large Language Models

Zhou, Jinfeng, Huang, Yongkang, Wen, Bosi, Bi, Guanqun, Chen, Yuxuan, Ke, Pei, Chen, Zhuang, Xiao, Xiyao, Peng, Libiao, Tang, Kuntian, Zhang, Rongsheng, Zhang, Le, Lv, Tangjie, Hu, Zhipeng, Wang, Hongning, Huang, Minlie

arXiv.org Artificial Intelligence | Dec-16-2024

Character-based dialogue (aka role-playing) enables users to freely customize characters for interaction, which often relies on LLMs, raising the need to evaluate LLMs' character customization capability. However, existing benchmarks fail to ensure a robust evaluation as they often only involve a single character category or evaluate limited dimensions. Moreover, the sparsity of character features in responses makes feature-focused generative evaluation both ineffective and inefficient. To address these issues, we propose CharacterBench, the largest bilingual generative benchmark, with 22,859 human-annotated samples covering 3,956 characters from 25 detailed character categories. We define 11 dimensions of 6 aspects, classified as sparse and dense dimensions based on whether character features evaluated by specific dimensions manifest in each response. We enable effective and efficient evaluation by crafting tailored queries for each dimension to induce characters' responses related to specific dimensions. Further, we develop the CharacterJudge model for cost-effective and stable evaluations. Experiments show its superiority over SOTA automatic judges (e.g., GPT-4) and our benchmark's potential to optimize LLMs' character customization. Our repository is at https://github.com/thu-coai/CharacterBench.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2412.11912

Country:

Asia > Russia (0.14)
Europe > United Kingdom > England (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(14 more...)

Genre:

Personal > Interview (1.00)
Overview (0.67)
Research Report (0.63)

Industry:

Health & Medicine (1.00)
Government > Military (1.00)
Education (1.00)
Water & Waste Management (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning

Liu, Yuti, Liu, Shice, Gao, Junyuan, Jiang, Pengtao, Zhang, Hao, Chen, Jinwei, Li, Bo

arXiv.org Artificial Intelligence | Dec-16-2024

Image Aesthetic Assessment (IAA) is a vital and intricate task that entails analyzing and assessing an image's aesthetic values, and identifying its highlights and areas for improvement. Traditional methods of IAA often concentrate on a single aesthetic task and suffer from inadequate labeled datasets, thus impairing in-depth aesthetic comprehension. Despite efforts to overcome this challenge through the application of Multi-modal Large Language Models (MLLMs), such models remain underdeveloped for IAA purposes. To address this, we propose a comprehensive aesthetic MLLM capable of nuanced aesthetic insight. Central to our approach is an innovative multi-scale text-guided self-supervised learning technique. This technique features a multi-scale feature alignment module and capitalizes on a wealth of unlabeled data in a self-supervised manner to structurally and functionally enhance aesthetic ability. The empirical evidence indicates that accompanied with extensive instruct-tuning, our model sets new state-of-the-art benchmarks across multiple tasks, including aesthetic scoring, aesthetic commenting, and personalized image aesthetic assessment. Remarkably, it also demonstrates zero-shot learning capabilities in the emerging task of aesthetic suggesting. Furthermore, for personalized image aesthetic assessment, we harness the potential of in-context learning and showcase its inherent advantages.

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2412.11952

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Pacific Ocean > North Pacific Ocean > East China Sea > Yellow Sea (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.82)

Industry:

Media > Photography (1.00)
Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
